ABSTRACT

In a study of factors that influence evaluators'
ratings of student papers, 32 student essays were rewritten to make
them stronger or weaker in content, organization, sentence structure,
or mechanics; the essays were then submitted to evaluators in both
their original and rewritten forms to determine the way in which the
changes influenced the ratings. This paper discusses the procedures
used in selecting the essays to be rewritten, rewriting them, and
having them evaluated; it then reports the results of the study,
examines the relationship between the holistic ratings and the
raters' perceptions of the papers' strengths or weaknesses in each of
the rewritten categories, and discusses the results and their
pedagogical significance. Among the major findings were that the most
important influences on raters' scores were the content and then the
organization of the essays and that sentence structure and mechanics
proved to be far less significant influences on holistic judgments.
Seven tables are included. (GfJ)

THE EVALUATORS OF STUDENT WRITING

This paper reports the results of an experimental study
about the factors in college-level student papers that
influence judges' rating of the quality of those papers. Most
past research on this topic has been correlational rather than
experimental. In a correlational study the researcher
investigates natural occurrences. Students write papers; judges
rate the quality of the papers. The researcher then examines
the papers for traits associated with high and low ratings.
One type of correlational study (e.g., Page, 1968; Hiller,
Marcotte, and Martin, 1969; Slotnick and Knapp, 1971; Thompson,
1976; and [illegible], 1977) attempted to predict ratings
with measures of characteristics in the student paper, such
as the number of spelling errors or the length of the essay.
Another type (e.g., Diederich, French, and Carleton, 1961;
and Meyers, McConville, and Coffman, 1966) sought to account
for ratings with characteristics of the judges, such as their
personal biases or their degree of leniency. These past studies
show that characteristics of papers and of judges are
associated with or correlated with ratings. However, it is not
possible for a correlational study to establish the causal
influence of papers or judges on the ratings.

To establish causal relations it is necessary to turn to
an experimental approach. For instance, in an experiment
on the evaluation of composition the researcher might rewrite
student papers to make them stronger or weaker along some
dimension of content or form and then see how such rewriting
influences the ratings. After the student writes the paper,
the researcher, instead of observing natural occurrences as
in the correlational paradigm, interferes with nature by
experimentally manipulating the student paper. Judges then
rate the quality of the rewritten paper. The researcher,
who created certain characteristics in the rewritten essay,
can determine the extent to which the manipulations influenced
the ratings. Such an experimental approach is akin to one
suggested by Hiller, Marcotte, and Martin (1969):

     if a given characteristic is present in an essay,
     does that characteristic affect the essay's
     quality as reflected in the grade assigned by expert
     graders? To answer this question we should have
     to manipulate the quality and quantity of relevant
     category items under an experimental procedure.
     (p. 27if)

For my study, I decided to manipulate characteristics
in essays to examine the influence of papers on ratings. My
first problem was which characteristics to manipulate. I did
not base my choice of characteristics on any one theory of
discourse. Instead, I selected four very broad, but
pedagogically interesting categories: content, organization,
sentence structure, and mechanics. More precise features which
fall under these broad categories, such as the number of spelling
errors or the length of the essay, had been the focus of many
of the correlational studies of the first type cited earlier.
However, for a first experimental study, I thought it wise to
manipulate general characteristics so that in future studies
on the influence of characteristics in papers on ratings the
features of the influential general categories could be
investigated.

I next rewrote essays of moderate quality to be either
stronger or weaker in the four categories of content,
organization, sentence structure, and mechanics. Exactly how to
perform the rewriting proved to be a very complex problem,
which I discuss in detail in a separate section.

In almost every correlational study some aspect of content
or a marker of content (e.g., essay length) predicted
ratings. Based on this finding I posited one hypothesis
about the effects of the rewriting: essays rewritten to be
strong in content would be rated significantly higher than
those rewritten to be weak in content. The findings of past
studies about the relationship between judges' ratings and
the quality of the organization, sentence structure, and
mechanics were not so consistent, making it difficult to
predict the potential effect of rewriting in these three
categories. Nevertheless, my experiment would allow me to
determine the effects of these pedagogically interesting
characteristics on ratings too.

College students in two different required writing
sections at each of four Bay Area colleges wrote essays for the
study. The colleges, which ranged in type from highly select
private schools to open-admissions public schools, provided
writers representing a wide range of abilities. According to
Cass and Birnbaum's (1972) most recent descriptions of
admissions criteria, the schools in order from most to least
selective admissions requirements were: Stanford University,
University of Santa Clara, California State University at
Hayward, and San Jose City College. The classes at each
school were obtained on the recommendation of the department
chair, who was asked to suggest two "typical" classes taught
by different teachers.

Students wrote the essays in class on one of eight
topics designed to elicit essays in the argumentative mode
of discourse. The topics either asked students to compare
and contrast two quotations or to argue their opinion on a
current, controversial issue. A sample of each type of
topic follows:

1.  A Founding Father said: "Get what you can, and what
    you get hold; 'Tis the Stone that will turn all your
    Lead into Gold."

    A contemporary writer said: "If it feels good,
    do it."

    What do these two statements say? Explain how they
    are alike and how they are different.

2.  President Ford gave Nixon an "unconditional pardon."
    Do you agree or disagree with Ford's decision? Give
    reasons for taking your position.



A student writing on each of the eight topics was
selected from each class to participate in an earlier study.
The papers of these same eight students from each class were
used as the basis for the rewriting in this study. In all,
there were eight student essays on each of eight topics, a
total of 64 papers. In the earlier study, four judges rated
each essay holistically. Of the eight student essays on each
topic, the four rated to be most average in quality in the
earlier study were selected for experimental rewriting in this
study. The other four, which were not rewritten, were the
two which had been rated highest and the two which had been
rated lowest on each topic in the earlier study. These
nonrewritten essays served to establish the reliability of the
ratings in this study.

REWRITING METHOD

Because of the dearth of operational definitions for
strength and weakness of content, organization, sentence
structure, and mechanics, I pondered, at first, how to
undertake the rewriting task. I decided on the set of procedures
in Table 1.

                  Insert Table 1 about here



To validate the rewriting procedures, I trained two different
students to rewrite. If the two students and I as independent
rewriters produced no significantly different results in essay
ratings, I then could obtain a measure of the effects of
rewriting the four categories to be weak or strong on the
ratings of the essays. Furthermore, the fact that it would
be possible to train others to follow the rewriting procedures
consistently indicates that the rewriting could be replicated.

Rewriting the content category to be weak brought one
major constraint. When the content was made weak, the
organization could never be made strong. It would have been an
exercise in absurdity to attempt to order illogical ideas
logically or to order and transition appropriately a group of
inherently unrelated ideas. Thus, there were twelve possible
rewriting combinations:
     C  = Content                + = Strong
     O  = Organization
     SS = Sentence Structure     - = Weak
     M  = Mechanics

      (1)  +C, +O, +SS, +M
      (2)  +C, +O, +SS, -M
      (3)  +C, +O, -SS, +M
      (4)  +C, +O, -SS, -M
      (5)  +C, -O, +SS, +M
      (6)  +C, -O, +SS, -M
      (7)  +C, -O, -SS, +M
      (8)  +C, -O, -SS, -M
      (9)  -C, -O, +SS, +M
     (10)  -C, -O, +SS, -M
     (11)  -C, -O, -SS, +M
     (12)  -C, -O, -SS, -M
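
The twelve combinations follow mechanically from the one constraint stated above (weak content never paired with strong organization). The Python sketch below is an illustration only and is not part of the original study; the function names `rewriting_combinations` and `label` are hypothetical. It generates all sixteen strong/weak patterns for the four categories and drops the four that violate the constraint.

```python
from itertools import product

# Category codes as used in the list above.
CATEGORIES = ("C", "O", "SS", "M")


def rewriting_combinations():
    """Return every strong/weak pattern allowed by the stated constraint."""
    combos = []
    for levels in product("+-", repeat=len(CATEGORIES)):
        combo = dict(zip(CATEGORIES, levels))
        # Constraint from the text: weak content rules out strong organization.
        if combo["C"] == "-" and combo["O"] == "+":
            continue
        combos.append(combo)
    return combos


def label(combo):
    """Format one combination in the notation of the list, e.g. '+C, -O, +SS, -M'."""
    return ", ".join(f"{combo[c]}{c}" for c in CATEGORIES)


if __name__ == "__main__":
    for number, combo in enumerate(rewriting_combinations(), start=1):
        print(f"({number}) {label(combo)}")
```

Of the sixteen unconstrained patterns, the four with weak content and strong organization are excluded, which leaves the twelve listed.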

As rewriters we had a commitment to create a revised
paper that retained, insofar as possible, the sense of the
original essay. We attempted to highlight the strengths and
weaknesses in each category in each paper. Nevertheless, the
act of highlighting often produced a new paper substantially
unlike the original. In spite of how unlike the original a
rewritten version became, we remained committed to rewrite
papers to be like the papers students actually produced.
Still, the rewriting aimed to reproduce only the reasonable
extremes of strength and weakness for each category. Papers
were never rewritten to be average in any category.

The rewriting was performed in layers: content first,
then organization, then sentence structure, and finally
mechanics. When an earlier layer was rewritten as strong and
a later one was rewritten as weak, the rewriters had to be
extremely careful not to obscure the strength of the earlier
category with the weakness of the latter. When rewriting
content to be strong, weaknesses in organization, sentence
structure, or mechanics were not allowed to obscure the ideas
and the development of those ideas. Similarly, when rewriting
sentence structure to be strong, weaknesses in mechanics were
not allowed to obscure the strength of the sentences.

Finally, the four broad rewriting categories were defined
to include all possible specific features in an essay that
relate to its quality. Thus, if a composition was rewritten
to be strong in every broad rewriting category, then it would
have no residual weaknesses. Likewise, if a composition was
rewritten to be weak in every category, it would have no
residual strengths. Because I used only four category
headings, some features related to essay quality did not fit under
any particular category. For example, the feature word choice
seemed to fit under none of the category headings. In fact,
word choice fit under both the content and the sentence
structure headings. Some changes in word choice affected the
clarity of presentation of an idea; they were included under
content. Other changes affected the parallel structure of a
sentence; they were included under sentence structure. Other
changes, which were purely matters of diction, were arbitrarily
placed under sentence structure.

This section discusses the plan for rewriting the four
student papers on each of the eight topics. First, each of
the papers was rewritten in three different versions.
Each original essay was keyed to three of the twelve possible
rewriting combinations listed earlier. The four essays, each
rewritten in three versions, made twelve versions on each
topic. The twelve rewritten versions on each topic represented
the twelve possible rewritten versions. Across the eight
topics, with twelve rewritten versions per topic, there were
96 rewritten papers.

In the end, because of the constraint against combining
weak content and strong organization, two-thirds of the 96
rewritten papers (N = 64) were strong in content; one-third
(N = 32) were strong in organization. Half (N = 48) were strong
in sentence structure, and half were strong in mechanics. Of
course, the remainder for each category was weak in that
category.

REWRITING PROCEDURES

Two Stanford University sophomores helped the investigator
perform the rewriting in return for course credit. All
rewriters first practiced applying the operational definitions
for strength and weakness in the four categories (Table 1) to
training essays, in order to establish and define common
ground as readers and writers. During practice all rewriters
independently rewrote the same essay according to the same
rewriting combinations, then exchanged rewrites and discussed
points of agreement and disagreement. During the actual
rewriting, one rewriter always wrote all three versions of an
essay. A second rewriter checked the rewriting, and the third
remained uninvolved.

EVALUATING DESIGN

Twelve evaluators were chosen according to the following
criteria: (1) strength of professional recommendations,
(2) quantity of teaching experience, and (3) educational
background. All were highly recommended teachers on the staff of
Stanford's freshman English program. I placed the evaluators
into three types from most (type 1) to least (type 3) teaching
experience and education. Evaluators were divided into four
reading groups of three judges each. Each group rated essays
on two of the eight topics. The different types of evaluators
were balanced across the groups in order to avoid placing a
group of less experienced evaluators together.

Training and reading packets were compiled for each rater
for each topic. The training packets contained holistic
scoring forms and two training essays typical of those in the
experimental set. In the reading packets two supplemental
training essays were followed by eight experimental student
essays. Of the eight experimental essays, all three evaluators
in each group received the four essays that had not been
rewritten. The four remaining essays in the experimental set
were selected for each judge from those that had been
rewritten. Each of the three evaluators received one of the three
versions of each of the four rewritten essays. The rewritten
versions were assigned to evaluators according to a balanced
plan. The order of the eight experimental essays was
randomized for each evaluator.
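
The text does not spell out the balanced plan, so the Python sketch below should be read as one possible scheme under stated assumptions, not as the procedure actually used. It assumes a simple cyclic rotation, so that each of the three versions of a rewritten essay goes to a different evaluator in the group; the function names, the packet labels, and the `seed` parameter are all hypothetical.

```python
import random


def assign_versions(num_essays=4, num_evaluators=3):
    """Return plan[evaluator][essay] = version index (0, 1, or 2).

    Assumption: a cyclic rotation. For every essay, the three evaluators
    in a group each receive a different one of its three versions.
    """
    return [
        [(essay + evaluator) % num_evaluators for essay in range(num_essays)]
        for evaluator in range(num_evaluators)
    ]


def build_packet(evaluator, plan, seed=None):
    """Build one evaluator's eight experimental essays in random order."""
    # Four essays that were not rewritten go to every evaluator in the group.
    unrewritten = [f"unrewritten essay {i + 1}" for i in range(4)]
    # One version of each of the four rewritten essays, taken from the plan.
    rewritten = [
        f"rewritten essay {essay + 1}, version {version + 1}"
        for essay, version in enumerate(plan[evaluator])
    ]
    packet = unrewritten + rewritten
    # Order is randomized separately for each evaluator.
    random.Random(seed).shuffle(packet)
    return packet


if __name__ == "__main__":
    plan = assign_versions()
    for evaluator in range(3):
        print(f"Evaluator {evaluator + 1}:")
        for item in build_packet(evaluator, plan, seed=evaluator):
            print("   ", item)
```

Under this rotation no evaluator sees two versions of the same essay, and all three versions of every rewritten essay are rated within the group.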

The evaluations took place on four consecutive days. One
group of three evaluators rated essays on two of the eight
topics on the first day; a second group of three evaluators
rated essays on another two of the eight topics on the second
day, and so on. Each group of evaluators was informed that
college students had produced the essays. The fact that some
essays had been rewritten was concealed from the evaluators.
All essays were typed.

Before rating any essays the group of evaluators discussed
their expectations for a good essay on the first topic they
would rate. Then, they rated the first two training essays
from the training packet, using the four-point holistic scale.
After rating the training essays, they discussed with the
trainer and with each other their reasons for assigning the
scores they did. If they evidenced a difference of two or
more points on the four-point holistic scale, the trainer
tried to guide them to understand and reconcile their
differences. Raters were never forced to agree.

After discussing these training essays, the evaluators
received their reading packet on the first topic and began the
holistic ratings. If the judges disagreed with one another
on scores for the optional training essays in the reading
packet, the reading was interrupted to continue training with
these optional training essays. This same procedure was
repeated for the second topic.

The group of judges first gave holistic evaluations to
all essays on both topics. After completing both holistic
evaluation sessions, the judges were asked to provide a more
detailed evaluation for the rewritten essays on each topic.
For these essays, the judges had to determine whether the
content, organization, sentence structure, and mechanics were
weak or strong. The fact that these essays had been rewritten
to be weak or strong in these four categories was still
concealed from the judges.

RELIABILITY

To assess the reliability of the judges' ratings, I used
the Cronbach alpha (Cronbach, 1970, p. 159; Calfee and Drum,
1976, p. 14). The reliability for the ratings given by each
group of judges was determined by comparing the ratings the
different judges in a group assigned to the four papers on
each topic that had not been rewritten. All ratings proved
highly reliable. The reliability scores within each group of
raters ranged from .86 to .96. Thus, these reliability scores
for the non-rewritten papers suggest that the ratings of the
rewritten papers were also quite reliable.
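
For readers unfamiliar with the coefficient, the Python sketch below shows the standard Cronbach alpha computation with the judges treated as the "items" and the papers as the cases. It is an illustration only: the ratings in the example are invented, not data from this study, and the function name `cronbach_alpha` is hypothetical.

```python
from statistics import variance


def cronbach_alpha(ratings_by_judge):
    """Cronbach's alpha for k judges who each rated the same papers.

    ratings_by_judge: list of k equal-length lists, one per judge.
    alpha = k / (k - 1) * (1 - sum of judge variances / variance of totals)
    """
    k = len(ratings_by_judge)
    if k < 2:
        raise ValueError("at least two judges are required")
    # Total score each paper received across all judges.
    totals = [sum(scores) for scores in zip(*ratings_by_judge)]
    # Sample variance is used for both terms; only their ratio matters.
    judge_variance = sum(variance(scores) for scores in ratings_by_judge)
    return k / (k - 1) * (1 - judge_variance / variance(totals))


if __name__ == "__main__":
    # Invented ratings: three judges, eight unrewritten papers
    # (four on each of two topics), four-point holistic scale.
    example = [
        [1, 2, 3, 4, 1, 4, 2, 3],
        [1, 2, 4, 4, 1, 3, 2, 3],
        [2, 2, 3, 4, 1, 4, 1, 3],
    ]
    print(round(cronbach_alpha(example), 2))
```

The coefficient approaches 1 as the judges' ratings rise and fall together across papers.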

I first examined the main results of the experiment: the
effects of rewriting strong and weak content, organization,
sentence structure, and mechanics on the raters' holistic
scores. With an analysis of variance, I measured whether
each rewriting characteristic contributed significantly to
the difference in the scores the raters gave (Table 2). My
hypothesis, that essays rewritten to be strong in content
would be rated significantly higher than those rewritten to
be weak in content, was confirmed. The largest main effect
of the rewriting was for the content variable. The
organization variable also proved to have a highly significant effect
on the judges' scores. Mechanics too had its effect.
Additionally, there were significant interactions between
organization and mechanics and between organization and sentence
structure.

                  Insert Table 2 about here

Table 3 helps explain these main results. It presents
the mean scores for papers rewritten to be strong and those
rewritten to be weak in each of the four rewriting categories.

                  Insert Table 3 about here

It reveals that the average score given papers weak in
content and the average score given papers
strong in content differed by 1.06 points. Since the maximum
possible difference between the average scores was 3 points
(on the 1 to 4 holistic scale), a difference of over one
point is quite large. Strong versus weak rewriting in
organization also led to a difference of about 1 point. The effect
of mechanics and sentence structure rewriting was about 1/2
and 1/4 point, respectively.

The interactions between organization and mechanics and
organization and sentence structure in these main results
show that only if the essay had strong organization did the
strength or weakness of the mechanics and sentence structure
matter (Table 4). If the organization was strong, the
mechanics rewriting caused almost an entire point difference
between the strong and weak essays' average scores. In the
same situation, sentence structure rewriting caused about a
1/2 point difference. The relation between organization and
mechanics was more significant than that between organization
and sentence structure.

                  Insert Table 4 about here

In summary, the main results of the rewriting showed that
the most significant influence on raters' scores is the
strength of the content of the essay. The second most
important influence proved to be the strength of the organization
of that content. The third significant influence was the
strength of the mechanics. Furthermore, the strength of the
mechanics was most important when the organization was strong,
and because the sentence structure alone was insignificant,
the strength of the sentence structure was important only
when the organization was strong.

EVALUATORS' PERCEPTIONS AND REWRITERS' INTENTIONS

I next prepared to examine a secondary set of main
results. Instead of using the actual rewriting as the
independent variable, I wished to examine the holistic ratings
according to the raters' perceptions of the strength or
weakness of each of the rewritten categories. The raters'
perceptions were determined by their indication of their judgment
of the strength or weakness of the rewritten categories of the
rewritten essays. However, before I could examine the results
using the raters' perceptions it was first necessary to measure
how well the raters' perceptions of the strength or weakness
of the rewritten categories matched with the way the rewriters
intended to rewrite them. If the match was exact there would
be no reason to seek these secondary results. Since the
categories were rewritten to be extremely strong or weak, I expected
the raters to perceive the rewriting accurately for the most
part even though they were not given the criteria for the
rewriting.

Table 5 specifies the overall percent of match and
mismatch for each category. Raters usually judged the strength
and weakness of the categories accurately, although they did
not always. The content category proved most difficult for
the raters to assess; organization was next in difficulty,
followed by sentence structure and then mechanics. This order
seems quite logical; the evaluators' overall perceptions of
the different categories matched with the rewriters'
intentions a lower percent of the time for the more difficult to
define, abstract categories than for the more objective,
concrete categories.

                  Insert Table 5 about here

EVALUATORS' PERCEPTIONS AND THEIR HOLISTIC EVALUATIONS

Since the evaluators' perceptions of the quality of the
content, organization, sentence structure, and mechanics of
the essay did not match the rewriters' intentions exactly, I
next examined the secondary set of major results: the
relationship between raters' perceptions and their holistic scores.
The evaluators' perceptions of the strength or weakness of the
content, organization, sentence structure, and mechanics became
the independent variables in the analysis of variance rather
than the actual rewriting for the categories. Table 6 shows
that the results for content and organization were similar to
those found in the main results detailed earlier. But other
findings proved different. Perceived mechanics, this time,
did not contribute significantly to the evaluators' scores;
perceived sentence structure did. None of the perceived
quality categories interacted significantly with one another.
Table 7 shows a comparison of the average difference between
ratings on the perceived strong and weak level of each
category across all of the rewritten essays.

                  Insert Table 6 about here

                  Insert Table 7 about here

In the interpretation of the results, several areas
deserve mention. First, all methods of analysis show the most
important influences on the raters' scores were the content
and then the organization of the essay. These two aspects of
the written text merit the special attention of the writing
student, teacher, and researcher. Sentence structure and
mechanics proved much less significant influences on holistic
judgments.

Because the influences of sentence structure and mechanics
are neither as strong nor as consistent as the influences of
content and organization, raters are probably less conscious
of the effects of these less important influences. The effects
of sentence structure and mechanics and the interactions of
these categories with organization differ between the analysis
using the actual rewriting as the independent variable and the
analysis using the judges' perceptions of the quality of the
rewritten categories as the independent variable. The
differences suggest that the judges' perceptions would have them
claim that their holistic ratings were not weighted on the
rewriting categories in the ways the analysis according to the
rewriting showed them to be. Raters seem to perceive that
they give: (1) less credit for the conventions of standard
edited English (mechanics); (2) more credit for well-formed,
graceful sentences (sentence structure); and (3) discrete
credit for the four rewriting categories.

Two raters were disqualified from the research because
the frequency of their mismatch was more than two standard
deviations above the mean. These raters also exhibited a
different pattern of mismatch from the others. They mismatched
on all categories, and they mismatched more than the others on
the more objective categories, mechanics and sentence
structure. The raters who did not show frequent mismatch tended
to cluster their mismatch on content or organization,
mismatching mostly on only one category. Perhaps raters'
abilities to perceive the quality of rewritten categories within
essays could be used to test their competence before choosing
them to participate in evaluation projects.
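
A screening rule of the kind described above can be written in a few lines. The Python sketch below is an illustration only: the mismatch counts are invented, the function name `flag_frequent_mismatchers` is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which form was computed.

```python
from statistics import mean, stdev


def flag_frequent_mismatchers(mismatch_counts, num_sd=2.0):
    """Return raters whose mismatch count exceeds mean + num_sd * SD.

    mismatch_counts: dict mapping a rater label to the number of category
    judgments that disagreed with the rewriters' intentions.
    Assumption: the sample standard deviation (statistics.stdev) is used.
    """
    counts = list(mismatch_counts.values())
    cutoff = mean(counts) + num_sd * stdev(counts)
    return sorted(r for r, c in mismatch_counts.items() if c > cutoff)


if __name__ == "__main__":
    # Invented counts for twelve raters; one rater mismatches far more often.
    counts = {f"rater {i}": 3 for i in range(1, 12)}
    counts["rater 12"] = 20
    print(flag_frequent_mismatchers(counts))
```

With few raters a single extreme value inflates both the mean and the standard deviation, so the cutoff itself depends on the raters being screened.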

The raters, both in their mismatch patterns and with
their holistic scores, showed a significant tendency to
evaluate student writing negatively. In all categories when
their perceptions did not match the rewriters' intentions,
they judged strong rewriting as weak more of the time than
not. Also, the distribution of the holistic scores was skewed
toward the lower end of the scoring range. Conlan (1976) at
Educational Testing Service corroborated this tendency of
readers to rate negatively: "Unfortunately, no reader --
experienced or inexperienced -- seems to need assurance about
giving out 2's and 1's [lowest scores on four-point holistic
scale]; what all readers seem to need from time to time is the
reminder that not all the papers are '2' papers or '1' papers"
(p. ). Perhaps evaluators should be less reluctant to
compliment student writing.

One limitation of this study is the difficulty in
interpreting the exact results of the rewriting. When each category
was rewritten, several aspects of the category were rewritten
at once. The exact aspects of the category which influenced
raters' reactions to that particular category remain unknown
and are a topic for further study. It is possible that the
raters reacted to the rewriting of all of the aspects for each
category. It is equally possible that they reacted to some
part or combination of parts of the rewriting. For example,
perhaps order but not transitions was what influenced raters
in the organization rewrite. Broad areas of influence on
raters' judgments have been identified; the more precise
influences need to be examined.


A second limitation is the homogeneity of the raters in
this study. They were carefully defined as a select,
homogeneous group of college writing teachers from a major
university. It would be interesting to learn how other raters
would react. Joseph Williams (1977), rewriting essays in nominal
and verbal styles, compared the responses of several types of
evaluators who thought they were evaluating for different
reasons. His judges included new graduate students in a
Master of Arts in Teaching program, experienced college English
professors, and evaluators who regularly read essays for a
state proficiency examination. Some evaluators thought they
were helping a fellow graduate student with a research project;
others thought they were determining the reliability of a
college writing examination. He found that different types
of raters preferred different types of essays. Some groups
preferred a nominal style; others preferred a verbal style.

If society values content and organization as much as the
raters in this project did, then according to the definitions
of content and organization used in this study, a pedagogy for
teaching writing should aim first to help students develop
their ideas logically, being sensitive to the appropriate
amount of explanation necessary for the audience. Then it
should focus on teaching students to organize the developed
ideas so that they would be easily understood and favorably
evaluated. The interaction between organization and mechanics
and organization and sentence structure, showing that the
quality of the mechanics and sentence structure matters most
when the organization is strong, points even more strongly to
a pedagogy aimed at teaching the skills of organization before
or at least alongside those of mechanics and sentence structure.

It seems today that many college level curricula begin
with a focus on helping students correct mechanical and
syntactic problems rather than with the more fundamental aspects
of the discourse. It is important to supplement these
curricula with carefully planned curricula for teaching content
and organization. Certainly, because of the excellent research
in the area of written sentence structure (Hunt, 1965; Mellon,
1969; O'Hare, 1971; Christensen, 1968) and because of the
objective nature of mechanical rules for standard edited
English, sentence structure and mechanics have become easier
to teach than content and organization. The English
profession knows more about teaching, evaluating, and doing
research on sentence structure and mechanics than on the less
objective areas of content and organization. Conceivably,
instruction in strengthening sentence structure or mechanics
could result in strong content or organization. But such a
hypothesis has not been tested.

Scholars like Donald Murray (1968), Ken Macrorie (1970),
and Peter Elbow (1973) have advocated college writing curricula
centered around the larger levels of the discourse. However,
although Murray, Macrorie, and Elbow offer pedagogical
suggestions for encouraging students to find and expand their
ideas, they do not offer as complete or as well-defined a
pedagogy as, say, Christensen does for syntax in The Christensen
Rhetoric Program (1968). Other scholars, like Kenneth Burke
(1945), D. Gordon Rohmann (1965), and Young, Becker, and Pike
(1970) have contributed to developing a modern theory of
invention. Young, Becker, and Pike, in particular, have
developed heuristic procedures for helping students retrieve,
analyze, and order their ideas for a particular audience.
Besides such work in invention, with pedagogies focused
primarily on idea generation, more research focusing on how to
analyze, teach, and evaluate the logical development of the
already generated ideas (content) and the techniques used for
ordering and making transitions between those ideas
(organization) is badly needed before more concrete pedagogies
can evolve.

CONCLUSIONS

The methodology employed in this experiment provides a
framework for studying the evaluation of student writing in
many other contexts. Certainly the following aspects of the
evaluation process deserve attention:

(1) the more exact effects of the rewriting (what
    within the categories influences the evaluators,
    does the influence work in a continuum; if so,
    where are the critical spots on the continuum?);

(2) evaluations given by different kinds of evaluators
    (e.g., peers, classroom teachers with varying
    amounts of experience who teach different subjects
    to different ages, teachers from non-mainstream
    cultural groups, teacher trainers);

(3) the evaluation of papers written by students from
    other age groups (elementary through senior high
    school);

(4) the evaluation of papers written in other modes
    of discourse (at least narrative or some expressive
    modes of writing).

I believe a more in-depth and more precise investigation
of the aspects within the two most influential rewriting
categories, content and organization, is the most important
and the most promising area for future research. In this
study much of the rewriting in these categories was done
intuitively. Now that some aspects of content and organization
have been proven powerful influences on evaluators'
judgments, the precise aspects of content and organization
that influence evaluators must be explored more carefully.
Schemes for the linguistic analyses of texts (e.g., Kintsch,
1974) might provide a foundation for more careful
experimentation in these aspects of writing. Out of such
explorations a sound basis for developing curricula focused on
teaching the skills of content and organization can evolve.

By using experimental research to learn more about the
evaluation process, educators will be able to develop more
efficient and fairer means of evaluation. Teachers as well
as researchers need to know how to evaluate the quality of
student writing. Discoveries of the bases of evaluators'
responses will contribute to a set of definitions of what
evaluators see as good writing. These definitions then can
be examined critically and those criteria of good writing that
seem sound can be incorporated into pedagogy and into training
evaluators of student writing. One of the first steps in
improving the evaluation and teaching of student writing is
understanding how evaluators evaluate as they do.




TABLE 1

REWRITING RULES

Content

Strong                            Weak

Delete all misinterpretations     Retain all misinterpretations
of quotations; add sound          of quotations; add one
reinterpretations.                misinterpretation if none are
                                  present.

Delete ideas not relevant to      Retain all ideas not relevant
the topic unless they can be      to the topic. Do not add extra
made relevant. If no ideas in     irrelevant ideas.
the paper are relevant, either
justify their inclusion or
pull together possible
relationships.

Delete repetition of entire       *Include repetition of entire
arguments.                        arguments.

Take remaining ideas and:         Take remaining ideas and:
develop, resolve logical          delete development, include
contradictions within ideas,      contradictions within ideas,
clarify (this involves changes    make ideas unclear and
in word choice).                  ambiguous (this involves
                                  changes in word choice).

*"Include" is used throughout this Table to mean retain
and/or add.


TABLE 1 (continued)

Organization

Strong                            Weak

1. Paragraph appropriately.       1. Include three
                                     misparagraphings per 250
                                     word page.

2. Order ideas logically.         2. Violate logical order by
   Respect rules of given-new        separating development of a
   information. Keep main            main idea (three times per
   ideas together.                   two pages). Violate
                                     given-new strategies.

3. Include appropriate inter      3. Delete inter and intra
   and intra paragraph               paragraph transitions: vary
   transitions; repeat key           the lexical items chosen
   words and use transition          for key words and avoid
   words and phrases                 using transition words and
   appropriately.                    phrases appropriately.


TABLE 1 (continued)

Sentence Structure

Strong                            Weak

1. Combine and balance            1. Achieve an immature
   sentences to achieve a            syntactic style: include
   mature syntactic style:           simple, primer sentences
   reduce number of compound         (include much compounding)
   sentences; untangle awkward       or include long, rambly,
   and unclear sentences,            uncontrolled, awkward
   include final free                sentences, delete graceful
   modifiers and graceful            parallelism, include
   parallel structures.              verboseness on the sentence
                                     level.

2. Vary sentence structure.       2. Include sentence fragments
                                     and run ons.

3. Include at least one           3. Delete advanced punctuation
   advanced punctuation mark:        marks: semicolon or colon.
   semicolon or colon.

4. Use appropriate tense and      4. Use inappropriate tense and
   reference between and             reference between and
   within sentences.                 within sentences.

5. Change any misused words.      5. Include misused words.
   Do not alter overall
   vocabulary level.


TABLE 1 (continued)

Mechanics

Strong                            Weak

Follow conventions of standard    1. Commas. Violate at least
edited English.                      three of the following
                                     rules: Comma before
                                     conjunction in compound
                                     sentence. Comma after
                                     introductory adverbial
                                     clause. Comma within
                                     quotation marks. Commas
                                     between words and phrases
                                     in series.

                                  2. Quotation marks. Overuse
                                     and use inconsistently. Use
                                     to emphasize words. Forget
                                     to either open or close
                                     quotations.

                                  3. Possessives. Misuse "'s."
                                     Omit when needed. Use
                                     structures like "their's."

                                  4. Capitalization. Omit for
                                     proper names. Forget to
                                     capitalize first word of
                                     sentences. Add
                                     inappropriately for
                                     emphasis.

                                  5. Underlining. Overuse and
                                     use inappropriately for
                                     emphasis.

                                  6. Spelling. Include four or
                                     five errors per page.



The operational definitions, the general rules we followed for rewriting
all four categories to be weak and strong, were adapted from descriptions
on analytic rating scales (Diederich, 1974; Adler, 1972; 1977),
were based on definitions used in past correlational research on readers'
responses (Thompson, 1976), and also were based on critical analyses of
the strengths and weaknesses within the student papers written for this
study.


TABLE 2

ANALYSIS OF VARIANCE FOR HOLISTIC SCORES:
REWRITING EFFECTS

Source                      df       MS        F1          F2

Reader (R)                  11      .448
Content (C)                  1     9.860     37.78***    31.70***
Organization (O)             1     5.195     29.69***    16.70***
Sentence Structure (SS)      1     1.501      2.54        4.82
Mechanics (M)                1     5.042      9.77**     16.21***

C X SS                       1     1.960                  6.30
C X M                        1      .990                  3.18
O X SS                       1     3.767                 12.11**
O X M                        1     6.155                 19.79***
SS X M                       1      .091

Reader Interactions
R X C                       11      .261
R X O                       11      .175
R X SS                      11      .591
R X M                       11      .516

Residual                    31      .311

 **p < .01     1,11 df   F1 =  9.65     1,31 df   F2 =  7.56
***p < .001    1,11 df   F1 = 19.69     1,31 df   F2 = 13.29

F1 based on R by Source
F2 based on residual error variance
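
The F ratios in Table 2 follow from the mean squares by simple
division: F1 divides an effect's mean square by the mean square
of its Reader-by-source interaction, and F2 divides it by the
residual mean square. The short sketch below recomputes three of
the main-effect rows as an arithmetic check. The numbers are the
mean squares printed in the table; the Python names are
assumptions introduced for the example.

```python
# Mean squares as printed in Table 2.
MS_EFFECT = {"C": 9.860, "O": 5.195, "M": 5.042}
MS_READER_BY_EFFECT = {"C": 0.261, "O": 0.175, "M": 0.516}
MS_RESIDUAL = 0.311


def f_ratios(effect):
    """Return (F1, F2) for one main effect, rounded to two decimals."""
    f1 = MS_EFFECT[effect] / MS_READER_BY_EFFECT[effect]  # R by Source error
    f2 = MS_EFFECT[effect] / MS_RESIDUAL                  # residual error
    return round(f1, 2), round(f2, 2)


for name in MS_EFFECT:
    print(name, f_ratios(name))
```

The recomputed values agree with the F1 and F2 columns for
Content, Organization, and Mechanics.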
TABLE 3

MEAN HOLISTIC JUDGMENTS (4 = highest, 1 = lowest)

                        Strong        Weak       Difference

Content                 N = 64       N = 32
                         2.375        1.313         1.06

Organization            N = 32       N = 64
                         2.656        1.703          .95

Sent. Str.              N = 48       N = 48
                         2.146        1.896          .25

Mechanics               N = 48       N = 48
                         2.250        1.792          .46

N = 96 rewritten essays



TABLE 4

EFFECTS OF INTERACTION BETWEEN
ORGANIZATION AND MECHANICS AND SENTENCE STRUCTURE
ON HOLISTIC SCORE (4 = highest, 1 = lowest)

                                     Organization
                                  Strong        Weak

Mechanics            Strong        3.124        1.813
                                  (.957)       (.592)
                     Weak          2.188        1.594
                                  (.834)       (.615)
                     Differ.        .936         .219

Sentence Structure   Strong        3.000        1.719
                                  (1.03)       (.581)
                     Weak          2.313        1.688
                                  (.873)       (.644)
                     Differ.        .687         .031

Cell entries are means, with standard deviations in parentheses.

organization X mechanics            p < .001
organization X sentence structure   p < .01
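
The interaction pattern in Table 4 can be checked with simple
arithmetic: for each level of organization, subtract the
weak-category mean from the strong-category mean, then compare
the two differences. The sketch below does this for both halves
of the table. The cell means are copied from Table 4, with one
stated assumption: the weak-organization, strong-mechanics mean
is taken as 1.813, the value implied by the printed difference of
.219. The function and variable names are introduced for the
example only.

```python
# Cell means from Table 4, keyed by (organization, second factor).
# Assumption: the weak-organization / strong-mechanics mean is taken
# as 1.813, the value implied by the printed difference of .219.
MECHANICS = {
    ("strong", "strong"): 3.124, ("strong", "weak"): 2.188,
    ("weak", "strong"): 1.813, ("weak", "weak"): 1.594,
}
SENTENCE_STRUCTURE = {
    ("strong", "strong"): 3.000, ("strong", "weak"): 2.313,
    ("weak", "strong"): 1.719, ("weak", "weak"): 1.688,
}


def simple_effects(cells):
    """Strong-minus-weak difference on the second factor, per organization level."""
    return {
        org: round(cells[(org, "strong")] - cells[(org, "weak")], 3)
        for org in ("strong", "weak")
    }


def interaction_contrast(cells):
    """Difference between the two simple effects (a difference of differences)."""
    effects = simple_effects(cells)
    return round(effects["strong"] - effects["weak"], 3)


for label, cells in (("mechanics", MECHANICS),
                     ("sentence structure", SENTENCE_STRUCTURE)):
    print(label, simple_effects(cells), interaction_contrast(cells))
```

In both halves the strong-minus-weak difference is larger when
organization is strong than when it is weak.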






TABLE 5

READER-REWRITER MATCH/MISMATCH

                            % Match     % Mismatch

Content (C)                   80.2         19.8
Organization (O)              83.3         16.7
Sentence Structure (SS)       84.4         15.6
Mechanics (M)                 90.6          9.4




TABLE 6

ANALYSIS OF VARIANCE FOR HOLISTIC SCORES:
PERCEIVED REWRITING EFFECTS

Source                      df       MS        F1          F2

Reader (R)                  11      .377
Content Perceived (CP)       1    12.537     41.65***    31.74***
Organ. Perceived (OP)        1     5.566     19.81***    14.09***
SenSt. Perceived (SSP)       1     3.501      7.34**      8.86***
Mech. Perceived (MP)         1     1.132      3.48        2.87*

CP X OP                      1      .481                  1.22
CP X SSP                     1      .131                   .33
CP X MP                      1      .146                   .37
OP X SSP                     1      .939                  2.38
OP X MP                      1      .034                   .09
SSP X MP                     1      .368                   .93

Reader Interactions
R X CP                      11      .301
R X OP                      11      .281
R X SSP                     11      .477
R X MP                      11      .325

Residual                    31      .395

  *p < .05     1,11 df   F1 =  4.84     1,31 df   F2 =  4.17
 **p < .01     1,11 df   F1 =  9.65     1,31 df   F2 =  7.56
***p < .001    1,11 df   F1 = 19.69     1,31 df   F2 = 13.29

F1 based on R by Source variance
F2 based on residual error variance



TABLE 7

MEAN HOLISTIC JUDGMENTS: PERCEIVED REWRITING
(4 = highest, 1 = lowest)

                        Strong        Weak       Difference

Content Percv'd         N = 64       N = 32
                         2.578        1.529         1.05

Organ. Percv'd          N = 32       N = 64
                         2.719        1.672         1.05

SentStr. Percv'd        N = 48       N = 48
                         2.340        1.714          .63

Mechan. Percv'd         N = 48       N = 48
                         2.356        1.725          .63

N = 96 rewritten essays






Footnotes

1 This topic was first developed by the California State
Universities and College System for their Freshman English
Equivalency Examination.

2 The method of selection of students was extremely complex
and is detailed in the author's dissertation.

3 Two raters from one of the four groups of raters had
more difficulty than any of the other raters in the sample in
matching their judgments of the strength and weakness of the
four rewriting categories with the rewriters' intentions.
These two raters were type 3, previously judged to be among
the least well qualified. Because they were two standard
deviations above the mean in the amount of mismatch between
their judgments and the rewriters' intentions, I replaced
them with a better qualified pair: one type 1 and one type 2
rater. These replacement raters performed the evaluations
together. Analyses are based on the rating given by the
replacement raters.



